[FIXED] LeafNode's queue group load balancing and Sublist.NumInterest #5982
While writing the test, I needed to make sure that each server in the hub had registered interest for 2 queue subscribers from the same group. I noticed that `Sublist.NumInterest()` (which I was invoking from `Account.Interest()`) was returning 1, even after I knew that the propagation should have happened. It turns out that `NumInterest()` was returning the number of queue groups, not the number of queue subs in all those queue groups.

For the leafnode queue balancing issue, the code was favoring local/routed queue subscriptions, so in the described issue, the message would always go from HUB1->HUB2->LEAF2->QSub instead of HUB1->LEAF1->QSub.

We had another test that was somewhat the reverse: a HUB, with LEAF1<->LEAF2 connecting to HUB, a qsub on both HUB and LEAF1, and requests originating from LEAF2. There we expected all responses to come from LEAF1 (instead of the responder on HUB). So I went with the following approach:

If the message originates from a client that connects to a server that has a connection from a remote LEAF, then we pick that LEAF the same as if it was a local client or routed server. However, if the client connects to a server that has a leaf connection to another server, then we keep track of the sub but do not send to it if we have local or routed qsubs.

This makes the 2 tests pass, solving the new test and maintaining the behavior for the old test.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
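The selection rule described above can be sketched roughly as follows. This is a simplified illustration, not the server's actual code: the `kind` constants, the `qsub` type, and the `pick` function are hypothetical names, and choosing uniformly among the preferred candidates is an assumption made only to keep the sketch short.

```go
package main

import (
	"fmt"
	"math/rand"
)

// kind describes how this server learned about a queue subscription.
// These names are hypothetical and exist only for this sketch.
type kind int

const (
	localClient   kind = iota // client connected directly to this server
	routed                    // known through a cluster route
	acceptedLeaf              // a remote leaf that connected to this server
	solicitedLeaf             // this server connected as a leaf to another server
)

type qsub struct {
	name string
	kind kind
}

// pick chooses one queue subscriber. Subs reached over a solicited leaf
// connection are tracked but only used when no other candidate exists.
// intn must return a value in [0, n), for example rand.Intn.
func pick(cands []qsub, intn func(n int) int) (qsub, bool) {
	var preferred, fallback []qsub
	for _, c := range cands {
		if c.kind == solicitedLeaf {
			fallback = append(fallback, c)
		} else {
			preferred = append(preferred, c)
		}
	}
	pool := preferred
	if len(pool) == 0 {
		pool = fallback
	}
	if len(pool) == 0 {
		return qsub{}, false
	}
	return pool[intn(len(pool))], true
}

func main() {
	// New test, seen from HUB1: LEAF1 connected to us, LEAF2 is behind HUB2.
	hub1 := []qsub{{"LEAF1", acceptedLeaf}, {"HUB2->LEAF2", routed}}
	// Old test, seen from LEAF2: LEAF1 is routed, HUB is reached as a leaf.
	leaf2 := []qsub{{"LEAF1", routed}, {"HUB", solicitedLeaf}}

	for i := 0; i < 3; i++ {
		a, _ := pick(hub1, rand.Intn)
		b, _ := pick(leaf2, rand.Intn)
		fmt.Println(a.name, "|", b.name)
	}
}
```

In this sketch the HUB1 view can select either path, while the LEAF2 view always selects LEAF1 and falls back to HUB only when nothing else is available.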
@neilalexander I believe there was an issue with `Sublist.NumInterest()` for queue subs, since it looked like it was simply counting the number of groups, not the total number of queue subscriptions. Let me know if I misunderstood the intent. @derekcollison Please review the PR description and see if the choice I made is OK.
You can review the first commit for the leafnode/sublist issues. The second is simply a bunch of missing `defer nc.Close()` calls and the like.
LGTM.
I notice looking back at #5918 that even before `NumInterest()` was added, the `Account.Interest()` function was still returning `len(res.psubs) + len(res.qsubs)`, so I think the bug is not new and I've just ported it over to the new code as-is.
That said, I think what you're proposing here makes sense, particularly if we're relying on the number of subscriptions to balance in this way.
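To make the difference concrete, here is a minimal sketch of the two ways of counting. The `result` struct is a simplified stand-in that assumes `qsubs` holds one slice per queue group, and `countGroups` and `countSubs` are hypothetical helper names, not functions from the server.

```go
package main

import "fmt"

// sub is a simplified stand-in for a subscription.
type sub struct {
	subject string
	queue   string // empty for a plain (non-queue) subscription
}

// result is a simplified match result: plain subs plus one slice per
// queue group (an assumption made for this sketch).
type result struct {
	psubs []*sub
	qsubs [][]*sub
}

// countGroups mirrors len(res.psubs) + len(res.qsubs): every queue group
// counts once, no matter how many members it has.
func countGroups(r *result) int {
	return len(r.psubs) + len(r.qsubs)
}

// countSubs counts every individual queue subscriber in every group.
func countSubs(r *result) int {
	n := len(r.psubs)
	for _, group := range r.qsubs {
		n += len(group)
	}
	return n
}

func main() {
	// Two queue subscribers on the same subject, in the same group.
	r := &result{
		qsubs: [][]*sub{{
			{subject: "foo", queue: "workers"},
			{subject: "foo", queue: "workers"},
		}},
	}
	fmt.Println(countGroups(r), countSubs(r)) // prints "1 2"
}
```

With two members in a single group, the group-based count yields 1 while the per-subscription count yields 2, which matches the behavior the test ran into.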
Something else that's just occurred to me is that @derekcollison Don't know whether we want to cherry-pick in
@neilalexander let's pull those into 2.10.22 from main once this lands.
LGTM - Thanks @kozlovic
Resolves #5972